In his 1987 paper (Cognitive Social Structures), Krackhardt includes data from a 21-member network of managers (all male) at a tech manufacturing company. These data include not only egocentric reports (these are the people I go to for advice), but also allocentric reports (I think John often asks Bill for advice). We can try testing whether multistep relational abstraction explains the subjects’ allocentric beliefs in this dataset.

Setup

# Set a random seed for reproducibility
set.seed(sum(utf8ToInt("krackhardt")))

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(here)
## here() starts at /Users/jaeyoungson/Documents/GitHub/multistep-relational-abstraction
library(tidygraph)
## 
## Attaching package: 'tidygraph'
## 
## The following object is masked from 'package:stats':
## 
##     filter
library(ggraph)
library(patchwork)
library(lme4)
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## 
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
library(lmerTest)
## 
## Attaching package: 'lmerTest'
## 
## The following object is masked from 'package:lme4':
## 
##     lmer
## 
## The following object is masked from 'package:stats':
## 
##     step
library(broom.mixed)

kable <- knitr::kable
vif <- car::vif

knitting <- knitr::is_html_output()

if (knitting) {
  # Create directories for saving stuff
  if (!dir.exists(here("outputs"))) {
    dir.create(here("outputs"))
  }
  
  if (!dir.exists(here("outputs", "07-krackhardt"))) {
    dir.create(here("outputs", "07-krackhardt"))
  }
}

# Pull in some modeling tools
source(here("code", "utils", "representation_utils.R"))
source(here("code", "utils", "network_utils.R"))

gg <- list(
  theme_bw(),
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines")
  )
)

plot_representation <- function(rep_df, relation_value, n_breaks = 3) {
  plot_df <- rep_df %>%
    filter(from != to) %>%
    group_by(from, to) %>%
    summarise(value = mean({{relation_value}}), .groups = "drop")
  
  plot_df %>%
    # Plot
    mutate(
      across(c(from, to), factor),
      from = fct_rev(from)
    ) %>%
    ggplot(aes(x=to, y=from, fill=value)) +
    gg +
    geom_tile() +
    coord_fixed() +
    scale_fill_viridis_c(
      option = "magma", name = NULL, na.value = "white", n.breaks = n_breaks
    ) +
    theme(
      legend.position = "bottom",
      axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)
    )
}

check_significance <- function(tidy_stats) {
  tidy_stats %>%
    mutate(
      significance = case_when(
        p.value < 0.001 ~ "***",
        p.value < 0.01 ~ "**",
        p.value < 0.05 ~ "*",
        p.value < 0.1 ~ ".",
        TRUE ~ ""
      )
    )
}

These data are included in Krackhardt 1987, Appendix A. Each allocentric CSV contains an adjacency matrix, so we’ll need to do a little work to get them into a tidy-friendly format.

The “egocentric” network is defined in the way that social networks are most often defined: based on each subject’s answer to the question, “Who are you connected to?” Oftentimes, this question is phrased to measure friendships (e.g., “Are you friends with X?”). In this particular dataset, the question is something closer to “Who do you ask for advice?”

The “allocentric” cognitive maps measure each subject’s answer to the question, “Does X ask Y for advice?” This includes the special case of, “Does X ask you for advice?” Here’s the rationale for including the special case in allocentric maps. Imagine for a moment that this were a study about friendships instead of advice-giving. In that case, we would consider “Are you friends with X?” to be a question that subjects would know the answer to (or at least, they would know better than anyone else). On the other hand, we would consider the question “Does X consider you a friend?” to be more of an inference than knowledge. I don’t know whether X considers me a friend; I don’t consider X to be a friend, but we do occasionally eat lunch together. Moreover, in cases of disagreement (X says he asks Y for advice; Y doesn’t agree), the “ground truth” egocentric network would then hinge on the researcher’s decision to use an intersection vs union rule to resolve disagreements; Krackhardt notes this in his paper. By alternatively defining the egocentric network based solely on the responses that subjects would have greatest introspective access to (their own advice-seeking behavior), we sidestep this researcher degree of freedom.

krackhardt_raw <- here("data", "krackhardt") %>%
  fs::dir_ls(regexp = "krackhardt_tech_managers_sub[[:digit:]]{2}\\.csv") %>%
  map_dfr(
    .f = ~read_csv(.x, col_names = FALSE, show_col_types = FALSE),
    .id = "filename"
  ) %>%
  mutate(
    sub_id = str_extract(filename, "sub[[:digit:]]{2}"),
    sub_id = str_remove(sub_id, "sub"),
    sub_id = as.numeric(sub_id)
  ) %>%
  select(-filename) %>%
  group_by(sub_id) %>%
  mutate(from = row_number()) %>%
  ungroup() %>%
  pivot_longer(
    cols = -c(sub_id, from),
    names_to = "to",
    values_to = "edge"
  ) %>%
  mutate(
    to = str_remove(to, "X"),
    to = as.integer(to)
  )

krackhardt_egocentric <- krackhardt_raw %>%
  filter(sub_id == from, from != to) %>%
  select(-sub_id)

krackhardt_allocentric <- krackhardt_raw %>%
  filter(sub_id != from, from != to)

Plot data

Okay, let’s try and get a sense for what the “egocentric” network looks like.

krackhardt_g <- krackhardt_egocentric %>%
  filter(edge == 1) %>%
  select(-edge) %>% 
  tbl_graph(edges = .) %>%
  mutate(name = row_number())

plot_network_graph <- krackhardt_g %>%
  mutate(Degree = centrality_degree(mode = "in")) %>%
  # Plot the "ground truth" network
  ggraph(layout = "stress") +
  geom_edge_link(
    aes(
      start_cap = label_rect(node1.name, padding = margin(15, 15, 15, 15)),
      end_cap = label_rect(node2.name, padding = margin(15, 15, 15, 15)),
    ),
    arrow = arrow(length = unit(5, "pt"), type = "closed")
  ) +
  geom_node_label(
    aes(label = name, size = Degree),
    fill = "grey95", show.legend = FALSE
  ) +
  scale_size(range = c(3, 10)) +
  theme(
    panel.background = element_blank(),
    strip.background = element_blank(),
    panel.border = element_blank(),
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    plot.title = element_text(hjust = 0.5, size = 13),
    strip.text = element_text(hjust = 0.5, size = 13),
    legend.position = "bottom"
  )

plot_network_graph
## Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.

We can also plot this as a matrix instead of a graph, which will ease comparison with the cognitive strategies we’ll consider next.

plot_network_adj <- krackhardt_egocentric %>%
  plot_representation(edge, n_breaks = 2) +
  ggtitle("Egocentric Network")

plot_network_adj

How does this compare to subjects’ mental representation of the network?

plot_choice <- krackhardt_allocentric %>%
  plot_representation(relation_value = edge) +
  ggtitle("Network Representation")

plot_network_adj | plot_choice

In the original paper, Krackhardt makes the following note: “There are some centers of focus for advice, notably 2 and 21 (both are vice presidents) … In [the subjectively perceived network], 21 loses his prominence as a major recipient of nominations: instead, 18, 14, and 7 (the president) appear to be more central.” To get a quantitative sense for this in our data, we can try calculating the “perceived” degree centrality by summing over the matrix columns, then comparing that against the “true” in-degree centrality.

centrality_differences <- krackhardt_g %>%
  mutate(
    Degree = centrality_degree(mode = "in"),
    Betweenness = centrality_betweenness(directed = TRUE)
  ) %>%
  as_tibble() %>%
  # Add represented in-degree; standardize sum by number of network members
  left_join(
    krackhardt_allocentric %>%
      group_by(name = to) %>%
      summarise(Perceived = sum(edge)/21, .groups = "drop"),
    by = c("name")
  ) %>%
  select(name, Perceived, everything()) %>%
  arrange(desc(Degree), desc(Perceived))

We can see that the same number of managers go to subjects 18 and 21 for advice (15 each), but that subject 18 is perceived as being much more “popular” (averaged across subjects, 11.95 managers are perceived to go to him for advice) than subject 21 (averaged across subjects, 8.38 managers are perceived to go to him for advice). A similar phenomenon happens with subjects 1 and 7, where subject 1 is perceived as being much less popular than subject 7, despite their having the same “true” popularity.

centrality_differences %>%
  ggplot(aes(x=Degree, y=Perceived)) +
  gg +
  geom_smooth(method = "lm", se = FALSE) +
  geom_label(aes(label = name)) +
  scale_x_continuous(name = "True (degree) centrality") +
  scale_y_continuous(name = "Perceived centrality")
## `geom_smooth()` using formula = 'y ~ x'

centrality_differences %>%
  select(name, Degree, Perceived) %>%
  kable()
name Degree Perceived
2 18 13.047619
18 15 11.952381
21 15 8.380952
7 13 10.380952
1 13 5.095238
11 11 6.761905
14 10 11.428571
6 10 4.904762
8 10 4.666667
10 9 3.952381
17 9 3.809524
20 8 4.190476
4 8 4.000000
16 8 3.428571
12 7 3.190476
5 5 4.761905
3 5 4.476191
19 4 4.619048
15 4 2.809524
13 4 2.714286
9 4 2.619048

Is it possible that subjects’ network representations might be reflecting some other centrality metric? In the original paper, Krackhardt notes that betweenness seems to have some predictive power. For the 18-21 and 1-7 test cases, we see that betweenness centrality does make the “correct” prediction about which network member is represented as being more important. But, we should note that betweenness doesn’t do the greatest job predicting perceived popularity across the board; its greatest predictive power is in explaining the popularity discrepancy for the most popular managers.

centrality_differences %>%
  pivot_longer(cols = Degree:Betweenness, names_to = "metric") %>%
  ggplot(aes(x=value, y=Perceived)) +
  gg +
  facet_wrap(~metric, scales = "free_x") +
  geom_smooth(method = "lm", se = FALSE) +
  geom_label(aes(label = name)) +
  scale_x_continuous(name = "True centrality") +
  scale_y_continuous(name = "Perceived centrality")
## `geom_smooth()` using formula = 'y ~ x'

centrality_differences %>%
  kable()
name Perceived Degree Betweenness
2 13.047619 18 5.9357143
18 11.952381 15 88.9166667
21 8.380952 15 60.1269841
7 10.380952 13 27.6246032
1 5.095238 13 13.7468254
11 6.761905 11 1.1984127
14 11.428571 10 0.5888889
6 4.904762 10 0.0000000
8 4.666667 10 3.9746032
10 3.952381 9 18.2968254
17 3.809524 9 2.5317460
20 4.190476 8 7.9793651
4 4.000000 8 13.7087302
16 3.428571 8 0.7000000
12 3.190476 7 0.2539683
5 4.761905 5 5.0785714
3 4.476191 5 6.6047619
19 4.619048 4 0.7539683
15 2.809524 4 6.1325397
13 2.714286 4 0.8928571
9 2.619048 4 3.9539683

Simulated observations

We know that an individual’s embedding within a network dictates what interactions and/or relationships they’re able to observe. Therefore, when building predicted cognitive maps, we should use a different set of interactions to inform each subject’s predicted representations. Of course, we have no idea what individual subjects’ observations actually are. So we’ll need to come up with a compact and reasonable set of assumptions about what observations were most likely available to individual network members.

Here are three key assumptions we’ll make here:

  1. We’ll assume (trivially) that people can observe interactions between themselves and their immediate advisors (and/or advisees).
  2. We’ll also assume that people are generally able to observe interactions between advisors-of-advisors (and/or advisees-of-advisees), either through firsthand observation or secondhand chatter.
  3. Though of course it’s plausible that people might observe more distant interactions, we’ll assume this happens with enough infrequency that we can ignore those interactions for the sake of modeling.

We’ve left open an ambiguity in those assumptions: are people observing advisors-of-advisors, and/or advisees-of-advisees? To aid understanding, we’ll use the following example. Suppose we’re looking at a subset of a network consisting of the grad student Abby, who asks the postdoc Betty for advice; in turn, Betty the postdoc asks Cathy the assistant professor for advice; and Cathy asks Dani the associate professor for advice. There are three reasonable ways that observations might work in this network:

  1. Observations go “out”: When Abby asks Betty for advice, Betty responds by sharing the wisdom she’s gained from asking Cathy for advice. That leads Abby to observe that Betty goes to Cathy for advice.
  2. Observations go “in”: Betty has asked Cathy for advice. Cathy thinks Betty has asked a great question that she doesn’t know the answer to, so Cathy asks Dani about it. That leads Dani to observe that Betty goes to Cathy for advice.
  3. It goes both ways.

We may be able to test this empirically. We would need to simulate a set of observations consistent with all three possibilities, generate model predictions from those simulations, then compare how well those models fit managers’ actual perceptions of the network.

# Observations go both ways
obs_edgelist_all <- expand_grid(
  sub_id = 1:21,
  # Start with all one-step egocentric observations
  krackhardt_egocentric
) %>%
  # Convert adjlist into edgelist
  filter(edge == 1) %>%
  select(-edge) %>%
  # For a given node, we want to know who their "advisors-of-advisors" are
  # (or equivalently, "advisees-of-advisees")
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "all")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  # We assume that nodes are able to observe (directly or through chatter)
  # advice-giving interactions from their advisors and advisors-of-advisors
  # (or equivalently, advisees and advisees-of-advisees)
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  # Make pretty
  arrange(sub_id, from)

# Advisees observe who advises their advisors
obs_edgelist_out <- expand_grid(
  sub_id = 1:21,
  krackhardt_egocentric
) %>%
  filter(edge == 1) %>%
  select(-edge) %>%
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "out")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  arrange(sub_id, from)

# Advisors observe who their advisees advise
obs_edgelist_in <- expand_grid(
  sub_id = 1:21,
  krackhardt_egocentric
) %>%
  filter(edge == 1) %>%
  select(-edge) %>%
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "in")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  arrange(sub_id, from)

obs_adjlist_all <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_all %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))

obs_adjlist_out <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_out %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))

obs_adjlist_in <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_in %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))

Schema-based representation

If people were using triad or community schemas/heuristics to inform representation, what would this look like? Using literally the exact same tools that we used to analyze the butterfly network, let’s create some predicted representations.

schema_experts_all <- obs_adjlist_all %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_all %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_all %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )

schema_experts_out <- obs_adjlist_out %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_out %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_out %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )

schema_experts_in <- obs_adjlist_in %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_in %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_in %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )
plot_mem_all <- plot_representation(schema_experts_all, mem_value) +
  ggtitle("Memorization")

plot_triad_all <- plot_representation(schema_experts_all, triad_value) +
  ggtitle("Triad Completion")

plot_comm_all <- plot_representation(schema_experts_all, comm_value) +
  ggtitle("Community Detection")

plot_schemas_all <- (
  plot_choice | plot_mem_all | plot_triad_all | plot_comm_all
) +
  plot_annotation(
    title = "Observations of advice-giving go both ways",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_all

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_all.pdf"),
    plot_schemas_all,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
plot_mem_out <- plot_representation(schema_experts_out, mem_value) +
  ggtitle("Memorization")

plot_triad_out <- plot_representation(schema_experts_out, triad_value) +
  ggtitle("Triad Completion")

plot_comm_out <- plot_representation(schema_experts_out, comm_value) +
  ggtitle("Community Detection")

plot_schemas_out <- (
  plot_choice | plot_mem_out | plot_triad_out | plot_comm_out
) +
  plot_annotation(
    title = "Advisees observe advisors' advisors",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_out

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_out.pdf"),
    plot_schemas_out,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
plot_mem_in <- plot_representation(schema_experts_in, mem_value) +
  ggtitle("Memorization")

plot_triad_in <- plot_representation(schema_experts_in, triad_value) +
  ggtitle("Triad Completion")

plot_comm_in <- plot_representation(schema_experts_in, comm_value) +
  ggtitle("Community Detection")

plot_schemas_in <- (
  plot_choice | plot_mem_in | plot_triad_in | plot_comm_in
) +
  plot_annotation(
    title = "Advisors observe advisees' advisees",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_in

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_in.pdf"),
    plot_schemas_in,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}

At the group-level… schemas make frankly terrible predictions, no matter what we assume about observations. Note that if a particular subject is predicted to ask all other managers for advice, the row associated with that subject will be filled with a value close to 1/20=0.05. So we can see that the triad completion strategy predicts that most managers are connected, and the community detection strategy predicts that literally all managers are connected. Neither comes even a little bit close to the actual allocentric network representation.

Successor repesentation

Let’s say people use multistep relational abstraction to represent networks, and let’s use the SR to approximate that kind of abstraction. Using the observations we generated before, let’s create lots of learning “trials” to train the algorithm.

sr_observations_all <- obs_edgelist_all %>%
  # Create simulated "learning trials" to train an asymptotic SR
  expand_grid(rep = 1:100) %>%
  # Randomize "order of trials" per repetition
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_observations_out <- obs_edgelist_out %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_observations_in <- obs_edgelist_in %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

And now we can use those to build SRs at different scales. We’ll assume that every subject is using the same successor horizon…

sr_all <- sr_observations_all %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )

sr_out <- sr_observations_out %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )

sr_in <- sr_observations_in %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )

Let’s get a sense for the predictions when the SR is trained on observations that go both ways.

plot_sr1_all <- sr_all %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_all <- sr_all %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_all <- sr_all %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_all <- sr_all %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_all <- sr_all %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_all <- (
  plot_choice +
    plot_sr1_all + plot_sr3_all + plot_sr5_all + plot_sr7_all + plot_sr9_all
)

plot_sr_gallery_all

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_all.pdf"),
    plot_sr_gallery_all,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}

And now where advisees observe their advisors’ advisors…

plot_sr1_out <- sr_out %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_out <- sr_out %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_out <- sr_out %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_out <- sr_out %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_out <- sr_out %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_out <- (
  plot_choice +
    plot_sr1_out + plot_sr3_out + plot_sr5_out + plot_sr7_out + plot_sr9_out
)

plot_sr_gallery_out

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_out.pdf"),
    plot_sr_gallery_out,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}

And finally where advisors observe their advisees’ advisees…

plot_sr1_in <- sr_in %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_in <- sr_in %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_in <- sr_in %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_in <- sr_in %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_in <- sr_in %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_in <- (
  plot_choice +
    plot_sr1_in + plot_sr3_in + plot_sr5_in + plot_sr7_in + plot_sr9_in
)

plot_sr_gallery_in

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_in.pdf"),
    plot_sr_gallery_in,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}

Statistical tests

Okay, now that we have all of the predicted strategies in hand, we’ll want to perform some formal statistical tests to verify that the SR is truly doing the best at explaining subjects’ network representations. Let’s create a dataframe with all of the info we’ll need.

test_dataset_all <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_all, by = c("sub_id", "from", "to")) %>%
  left_join(sr_all, by = c("sub_id", "from", "to"))

test_dataset_out <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_out, by = c("sub_id", "from", "to")) %>%
  left_join(sr_out, by = c("sub_id", "from", "to"))

test_dataset_in <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_in, by = c("sub_id", "from", "to")) %>%
  left_join(sr_in, by = c("sub_id", "from", "to"))

We’ll do all of the statistical tests for the “observations go both ways” models. Note that we can’t actually estimate a model for community detection, as this strategy predicts that everyone is connected to everyone.

test_mem_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

# test_comm_all <- test_dataset_all %>%
#   glmer(
#     choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
#     family = "binomial", data = .
#   )

test_sr1_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )

Moving on to the “advisees observe advisors’ advisors” models.

test_mem_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

test_comm_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
    family = "binomial", data = .
  )

test_sr1_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )

Finally, the “advisors observe advisees’ advisees” models. Once again, we can’t estimate a model for community detection, as this strategy predicts that everyone is connected to everyone.

test_mem_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

# test_comm_in <- test_dataset_in %>%
#   glmer(
#     choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
#     family = "binomial", data = .
#   )

test_sr1_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )

Of all of these models, which has the “best” fit, based on BIC?

# Which has the best "goodness-of-fit"?
goodness_of_fit <- bind_rows(
  # All: goes both ways
  glance(test_mem_all) %>% mutate(model = "mem_all"),
  glance(test_triad_all) %>% mutate(model = "triad_all"),
  glance(test_sr1_all) %>% mutate(model = "sr1_all"),
  glance(test_sr2_all) %>% mutate(model = "sr2_all"),
  glance(test_sr3_all) %>% mutate(model = "sr3_all"),
  glance(test_sr4_all) %>% mutate(model = "sr4_all"),
  glance(test_sr5_all) %>% mutate(model = "sr5_all"),
  glance(test_sr6_all) %>% mutate(model = "sr6_all"),
  glance(test_sr7_all) %>% mutate(model = "sr7_all"),
  glance(test_sr8_all) %>% mutate(model = "sr8_all"),
  glance(test_sr9_all) %>% mutate(model = "sr9_all"),
  # Out: advisors of advisors
  glance(test_mem_out) %>% mutate(model = "mem_out"),
  glance(test_triad_out) %>% mutate(model = "triad_out"),
  glance(test_comm_out) %>% mutate(model = "comm_out"),
  glance(test_sr1_out) %>% mutate(model = "sr1_out"),
  glance(test_sr2_out) %>% mutate(model = "sr2_out"),
  glance(test_sr3_out) %>% mutate(model = "sr3_out"),
  glance(test_sr4_out) %>% mutate(model = "sr4_out"),
  glance(test_sr5_out) %>% mutate(model = "sr5_out"),
  glance(test_sr6_out) %>% mutate(model = "sr6_out"),
  glance(test_sr7_out) %>% mutate(model = "sr7_out"),
  glance(test_sr8_out) %>% mutate(model = "sr8_out"),
  glance(test_sr9_out) %>% mutate(model = "sr9_out"),
  # In: advisees of advisees
  glance(test_mem_in) %>% mutate(model = "mem_in"),
  glance(test_triad_in) %>% mutate(model = "triad_in"),
  glance(test_sr1_in) %>% mutate(model = "sr1_in"),
  glance(test_sr2_in) %>% mutate(model = "sr2_in"),
  glance(test_sr3_in) %>% mutate(model = "sr3_in"),
  glance(test_sr4_in) %>% mutate(model = "sr4_in"),
  glance(test_sr5_in) %>% mutate(model = "sr5_in"),
  glance(test_sr6_in) %>% mutate(model = "sr6_in"),
  glance(test_sr7_in) %>% mutate(model = "sr7_in"),
  glance(test_sr8_in) %>% mutate(model = "sr8_in"),
  glance(test_sr9_in) %>% mutate(model = "sr9_in"),
) %>%
  select(model, everything()) %>%
  arrange(BIC) %>%
  # Make this a little more human-readable
  separate(model, into = c("model", "direction")) %>%
  mutate(
    direction = case_when(
      direction == "out" ~ "advisee observes advisor of advisor",
      direction == "in" ~ "advisor observes advisee of advisee",
      direction == "all" ~ "both ways"
    )
  )

goodness_of_fit %>%
  mutate(
    model = fct_relevel(
      model,
      "sr1", "sr2", "sr3", "sr4", "sr5", "sr6", "sr7", "sr8", "sr9",
      "mem", "triad", "comm"
    )
  ) %>%
  ggplot(aes(x = model, y = BIC, color = direction)) +
  gg +
  geom_point() +
  ggtitle("Model goodness-of-fit") +
  theme(legend.position = "bottom")

goodness_of_fit %>%
  kable()
model direction nobs sigma logLik AIC BIC deviance df.residual
sr7 advisee observes advisor of advisor 8400 1 -4363.029 8738.058 8780.274 8565.413 8394
sr8 advisee observes advisor of advisor 8400 1 -4363.853 8739.706 8781.922 8566.508 8394
sr6 advisee observes advisor of advisor 8400 1 -4369.466 8750.931 8793.147 8580.404 8394
sr9 advisee observes advisor of advisor 8400 1 -4375.204 8762.408 8804.624 8592.060 8394
sr5 advisee observes advisor of advisor 8400 1 -4379.140 8770.281 8812.497 8602.266 8394
sr4 advisee observes advisor of advisor 8400 1 -4389.390 8790.781 8832.996 8625.097 8394
sr8 both ways 8400 1 -4398.735 8809.469 8851.685 8657.019 8394
sr7 both ways 8400 1 -4398.873 8809.746 8851.962 8657.895 8394
sr3 advisee observes advisor of advisor 8400 1 -4398.903 8809.805 8852.021 8646.054 8394
sr6 both ways 8400 1 -4403.292 8818.583 8860.799 8668.298 8394
sr9 both ways 8400 1 -4406.178 8824.357 8866.573 8673.713 8394
sr2 advisee observes advisor of advisor 8400 1 -4407.193 8826.387 8868.603 8664.134 8394
sr5 both ways 8400 1 -4409.244 8830.487 8872.703 8681.930 8394
sr1 advisee observes advisor of advisor 8400 1 -4414.189 8840.378 8882.594 8679.243 8394
sr4 both ways 8400 1 -4415.308 8842.615 8884.831 8695.618 8394
sr3 both ways 8400 1 -4420.912 8853.824 8896.040 8708.120 8394
triad advisee observes advisor of advisor 8400 1 -4422.288 8856.576 8898.791 8721.435 8394
sr2 both ways 8400 1 -4425.886 8863.773 8905.989 8719.092 8394
sr1 both ways 8400 1 -4430.221 8872.442 8914.658 8728.545 8394
triad both ways 8400 1 -4431.553 8875.106 8917.322 8740.564 8394
mem advisee observes advisor of advisor 8400 1 -4444.995 8899.991 8935.171 8763.004 8395
mem both ways 8400 1 -4452.356 8914.713 8949.893 8775.496 8395
comm advisee observes advisor of advisor 8400 1 -4461.317 8934.634 8976.850 8821.487 8394
mem advisor observes advisee of advisee 8400 1 -4590.630 9191.259 9226.439 9024.731 8395
sr7 advisor observes advisee of advisee 8400 1 -4591.253 9194.506 9236.722 9021.134 8394
triad advisor observes advisee of advisee 8400 1 -4591.808 9195.615 9237.831 9044.740 8394
sr6 advisor observes advisee of advisee 8400 1 -4591.998 9195.997 9238.213 9024.658 8394
sr8 advisor observes advisee of advisee 8400 1 -4593.417 9198.834 9241.050 9024.487 8394
sr5 advisor observes advisee of advisee 8400 1 -4594.073 9200.146 9242.362 9031.037 8394
sr4 advisor observes advisee of advisee 8400 1 -4596.625 9205.251 9247.467 9038.234 8394
sr3 advisor observes advisee of advisee 8400 1 -4599.254 9210.508 9252.724 9045.352 8394
sr9 advisor observes advisee of advisee 8400 1 -4600.464 9212.929 9255.145 9040.442 8394
sr2 advisor observes advisee of advisee 8400 1 -4601.778 9215.556 9257.772 9052.026 8394
sr1 advisor observes advisee of advisee 8400 1 -4604.117 9220.234 9262.450 9058.124 8394

Let’s take a look at the best-fitting model estimates:

test_sr7_out %>%
  tidy(conf.int = TRUE) %>%
  kable()
effect group term estimate std.error statistic p.value conf.low conf.high
fixed NA (Intercept) -1.8725110 0.1852784 -10.106474 0 -2.235650 -1.509372
fixed NA mem_value 3.5286019 0.4025315 8.766027 0 2.739655 4.317549
fixed NA sr7 14.7453961 2.3390595 6.303985 0 10.160924 19.329868
ran_pars sub_id sd__(Intercept) 0.8095989 NA NA NA NA NA
ran_pars sub_id cor__(Intercept).sr7 -0.2827640 NA NA NA NA NA
ran_pars sub_id sd__sr7 9.5523177 NA NA NA NA NA
vif(test_sr7_out)
## mem_value       sr7 
##  1.081353  1.081353

SR centrality

Can the best-fitting SR explain an individual’s perceptions of other network members’ centrality?

sr_centrality <- sr_out %>%
  # SR centrality from best variant
  filter(from != to) %>%
  group_by(sub_id, node_id = to) %>%
  summarise(sr_centrality = mean(sr7), .groups = "drop") %>%
  # Perceived centrality
  left_join(
    krackhardt_allocentric %>%
      group_by(sub_id, node_id = to) %>%
      summarise(perceived_centrality = mean(edge), .groups = "drop"),
    by = c("sub_id", "node_id")
  )

To answer this, we’ll do a random-effects analysis where we take each individual’s idiosyncratic mental representation of their network, then rank-correlate it with the SR’s individual-level predictions.

sr_centrality_corr <- sr_centrality %>%
  group_by(sub_id) %>%
  nest() %>%
  mutate(
    spearman = map_dbl(
      .x = data,
      .f = ~with(
        .x, cor(sr_centrality, perceived_centrality, method = "spearman")
      )
    )
  ) %>%
  ungroup() %>%
  select(-data)

sr_centrality_corr %>%
  mutate(spearman = tanh(spearman)) %>%
  select(spearman) %>%
  deframe() %>%
  t.test() %>%
  tidy() %>%
  kable(
    caption = str_c(
      "t-test of the average Spearman correlation between perceived and ",
      "SR-predicted centrality (z-transformed)"
    )
  )
t-test of the average Spearman correlation between perceived and SR-predicted centrality (z-transformed)
estimate statistic p.value parameter conf.low conf.high method alternative
0.3942865 14.28888 0 20 0.3367265 0.4518465 One Sample t-test two.sided
sr_centrality_corr %>%
  summarise(mean_spearman = mean(spearman)) %>%
  kable()
mean_spearman
0.4257979

Let’s try to visualize these results in an intuitive format.

plot_sr_centrality_group <- sr_centrality %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Average across subjects to get group-level results
  group_by(node_id) %>%
  summarise(across(ends_with("_centrality"), mean)) %>%
  # For color-coding two pairs of managers with equivalent degree centrality,
  # but different perceived centrality
  mutate(
    colorcode = case_when(
      node_id %in% c(1, 7) ~ "a",
      node_id %in% c(18, 21) ~ "b"
    )
  ) %>%
  ggplot(aes(x=perceived_centrality, y=sr_centrality)) +
  gg +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  geom_label(aes(label = node_id, fill = colorcode), show.legend = FALSE) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_x_continuous(name = "Perceived centrality (rank-ordered)") +
  scale_y_continuous(name = "SR centrality (rank-ordered)") +
  ggtitle("Perceived vs Successor Centrality (Group-Level)")

plot_sr_centrality_group
## `geom_smooth()` using formula = 'y ~ x'

plot_sr_centrality_individual <- sr_centrality %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id, nrow = 3) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Representation (Simulated)")
  ) +
  ggtitle("Perceived vs Successor Centrality (Individual-Level)") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

plot_sr_centrality_individual

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_individual_standard.pdf"),
    plot = plot_sr_centrality_individual,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}

We’re particularly interested in the manager pairs 1-7 and 18-21. On an individual-by-individual basis, does the model correctly predict, in each pair, which manager is represented as being more central? We’ll do a prevalence test to see whether the model made correct predictions for significantly greater than 50% of subjects.

sr_centrality %>%
  # Find and retain only relevant node-pairs
  mutate(
    pair_id = case_when(
      node_id %in% c(1, 7) ~ "1-7",
      node_id %in% c(18, 21) ~ "18-21"
    )
  ) %>%
  drop_na() %>%
  # Whichever manager is represented as being more central in each node-pair,
  # does the model make a correct prediction?
  group_by(sub_id, pair_id) %>%
  mutate(
    across(
      c(sr_centrality, perceived_centrality),
      ~.x - first(.x)
    )
  ) %>%
  filter(node_id %in% c(7, 21)) %>%
  summarise(
    correct = sr_centrality * perceived_centrality > 0,
    .groups = "drop"
  ) %>%
  # Tally correct predictions
  count(pair_id, correct) %>%
  # Prevalence test
  filter(correct) %>%
  split(.$pair_id) %>%
  map_dfr(
    .f = ~.x %>%
      select(n) %>%
      deframe() %>%
      binom.test(x = ., n = 21, p = 0.5, alternative = "greater") %>%
      tidy(),
    .id = "pair_id"
  ) %>%
  kable(caption = "Prevalence test (> 50%)")
Prevalence test (> 50%)
pair_id estimate statistic p.value parameter conf.low conf.high method alternative
1-7 0.7142857 15 0.0391769 21 0.5126112 1 Exact binomial test greater
18-21 0.8571429 18 0.0007448 21 0.6707890 1 Exact binomial test greater

We’d previously noted that betweenness seems to track discrepancies between a manager’s true versus perceived centrality. Does SR centrality track with betweenness? The loess curve suggests the relationship is nonlinear, but there is a strongly significant rank correlation between the two.

sr_centrality %>%
  group_by(name = node_id) %>%
  summarise(sr_centrality = mean(sr_centrality)) %>%
  left_join(centrality_differences) %>%
  ggplot(aes(x=sr_centrality, y=Betweenness)) +
  gg +
  geom_smooth(method = "loess", se = FALSE) +
  geom_label(aes(label = name)) +
  xlab("Successor Representation \u03B3 = 0.7")
## Joining with `by = join_by(name)`
## `geom_smooth()` using formula = 'y ~ x'

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "centrality_sr_vs_betweenness.pdf"),
    width = 6, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
## `geom_smooth()` using formula = 'y ~ x'
sr_centrality %>%
  group_by(name = node_id) %>%
  summarise(sr_centrality = mean(sr_centrality)) %>%
  left_join(centrality_differences) %>%
  with(cor.test(Betweenness, sr_centrality, method = "spearman", exact = FALSE)) %>%
  tidy() %>%
  kable(caption = "Betweenness vs SR gamma 0.7")
## Joining with `by = join_by(name)`
Betweenness vs SR gamma 0.7
estimate statistic p.value method alternative
0.8948052 162 0 Spearman’s rank correlation rho two.sided

Company leadership

So far, we have not provided the model with any privileged knowledge about the company’s senior leadership hierarchy. Yet, using observations constrained by the network structure, the model has identified most of the company’s leadership anyways.

The CEO is coded as node 7, and the four Vice Presidents are coded as 2, 14, 18, and 21. We see clear evidence that the model identifies nodes 2, 7, 18, and 21 as being central figures, but the model makes conspicuously incorrect predictions about manager 14.

Could the model in principle do a better job at identifying manager 14 as a central network member? To test this, we’ll make one additional assumption about managers’ observations: they observe the senior leadership twice as much as other network members.

sr_observations_leadership <- obs_edgelist_out %>%
  # Double-count CEO and VPs
  bind_rows(
    obs_edgelist_out %>%
      filter(from %in% c(2, 7, 14, 18, 21) | to %in% c(2, 7, 14, 18, 21))
  ) %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_leadership <- sr_observations_leadership %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )

We’ll test whether the model goodness-of-fit is better at \(\gamma = 0.7\), which was the best-fitting model from before:

test_sr7_leadership <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_out, by = c("sub_id", "from", "to")) %>%
  left_join(sr_leadership, by = c("sub_id", "from", "to")) %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

bind_rows(
  glance(test_sr7_out) %>% mutate(model = "sr7 standard"),
  glance(test_sr7_leadership) %>% mutate(model = "sr7 leadership 2x")
) %>%
  kable()
nobs sigma logLik AIC BIC deviance df.residual model
8400 1 -4363.029 8738.058 8780.274 8565.413 8394 sr7 standard
8400 1 -4301.586 8615.172 8657.388 8443.772 8394 sr7 leadership 2x
sr_centrality_leadership <- sr_leadership %>%
  # SR centrality from best variant
  filter(from != to) %>%
  group_by(sub_id, node_id = to) %>%
  summarise(sr_centrality = mean(sr7), .groups = "drop") %>%
  # Perceived centrality
  left_join(
    krackhardt_allocentric %>%
      group_by(sub_id, node_id = to) %>%
      summarise(perceived_centrality = mean(edge), .groups = "drop"),
    by = c("sub_id", "node_id")
  )

sr_centrality_corr_leadership <- sr_centrality_leadership %>%
  group_by(sub_id) %>%
  nest() %>%
  mutate(
    spearman = map_dbl(
      .x = data,
      .f = ~with(
        .x, cor(
          sr_centrality, perceived_centrality, method = "spearman"
        )
      )
    )
  ) %>%
  ungroup() %>%
  select(-data)

sr_centrality_corr_leadership %>%
  mutate(spearman = tanh(spearman)) %>%
  select(spearman) %>%
  deframe() %>%
  t.test() %>%
  tidy() %>%
  kable(
    caption = str_c(
      "t-test of the average Spearman correlation between perceived and ",
      "SR-predicted centrality (z-transformed)"
    )
  )
t-test of the average Spearman correlation between perceived and SR-predicted centrality (z-transformed)
estimate statistic p.value parameter conf.low conf.high method alternative
0.4581988 16.76413 0 20 0.4011851 0.5152126 One Sample t-test two.sided
sr_centrality_corr_leadership %>%
  summarise(mean_spearman = mean(spearman)) %>%
  kable()
mean_spearman
0.5059155
plot_sr_centrality_group_leadership <- sr_centrality_leadership %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Average across subjects to get group-level results
  group_by(node_id) %>%
  summarise(across(ends_with("_centrality"), mean)) %>%
  # For color-coding two pairs of managers with equivalent degree centrality,
  # but different perceived centrality
  mutate(
    colorcode = case_when(
      node_id %in% c(1, 7) ~ "a",
      node_id %in% c(18, 21) ~ "b"
    )
  ) %>%
  ggplot(aes(x=perceived_centrality, y=sr_centrality)) +
  gg +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  geom_label(aes(label = node_id, fill = colorcode), show.legend = FALSE) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_x_continuous(name = "Perceived centrality (rank-ordered)") +
  scale_y_continuous(name = "SR centrality (rank-ordered)") +
  ggtitle("Perceived vs Successor Centrality (Group-Level)")

plot_sr_centrality_group_leadership
## `geom_smooth()` using formula = 'y ~ x'

plot_sr_centrality_individual_leadership <- sr_centrality_leadership %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id, nrow = 3) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Representation (Simulated)")
  ) +
  ggtitle("Perceived vs Successor Centrality (Individual-Level)") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

plot_sr_centrality_individual_leadership

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_group_leadership.pdf"),
    plot = plot_sr_centrality_group_leadership,
    width = 6, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
  
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_individual_leadership.pdf"),
    plot = plot_sr_centrality_individual_leadership,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
## `geom_smooth()` using formula = 'y ~ x'

Figures for paper

network_for_paper <- krackhardt_g %>%
  mutate(
    Degree = centrality_degree(mode = "in"),
    node_colorcode = case_when(
      name %in% c(1, 7) ~ "a",
      name %in% c(18, 21) ~ "b"
    )
  ) %>%
  # To clean up some of the overplotting
  activate("edges") %>%
  mutate(
    edge_colorcode = case_when(
      from %in% c(1, 7) | to %in% c(1, 7) ~ "a",
      from %in% c(18, 21) | to %in% c(18, 21) ~ "b"
    )
  ) %>%
  # Plot the "ground truth" network
  ggraph(layout = "stress") +
  geom_edge_link(
    aes(
      start_cap = label_rect(node1.name, padding = margin(15, 15, 15, 15)),
      end_cap = label_rect(node2.name, padding = margin(15, 15, 15, 15)),
      color = edge_colorcode
    ),
    arrow = arrow(length = unit(5, "pt"), type = "closed"),
    show.legend = FALSE
  ) +
  geom_node_label(
    aes(label = name, size = Degree, fill = node_colorcode),
    show.legend = FALSE
  ) +
  scale_size(range = c(3, 10)) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_edge_color_manual(
    values = c("#1f78b4", "#33a02c"), na.value = "grey80"
  ) +
  ggtitle("Advice Network (Krackhardt 1987)") +
  theme(
    panel.background = element_blank(),
    strip.background = element_blank(),
    panel.border = element_blank(),
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    plot.title = element_text(hjust = 0.5, size = 13),
    strip.text = element_text(hjust = 0.5, size = 13),
    legend.position = "bottom"
  )

network_for_paper

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "network_graph.pdf"),
    plot = network_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
representation_for_paper <- (
  (plot_choice + ggtitle("Perceived (Empirical)")) |
    plot_sr7_out + ggtitle("Successor Rep. \u03B3 = 0.7 (Simulated)")
) +
  plot_annotation(
    title = "Network Representation",
    theme = theme(plot.title = element_text(size = 13, hjust = 0.5))
  )

representation_for_paper

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "representation.pdf"),
    plot = representation_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
centrality_individual_for_paper <- sr_centrality %>%
  filter(sub_id %in% c(17, 19)) %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Rep. (Simulated)")
  ) +
  ggtitle("Individual-Level") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

centrality_for_paper <- (
  (
    plot_sr_centrality_group +
      ggtitle("Group-Level") +
      scale_y_continuous(name = "Successor Rep. centrality (rank-ordered)")
  ) |
    centrality_individual_for_paper
) +
  plot_annotation(
    title = "Perceived vs Successor Centrality",
    theme = theme(plot.title = element_text(size = 13, hjust = 0.5))
  )
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
centrality_for_paper
## `geom_smooth()` using formula = 'y ~ x'

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "centrality.pdf"),
    plot = centrality_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
## `geom_smooth()` using formula = 'y ~ x'

Save data

It was a bit of a pain getting the raw data into shape, so we’ll save the cleaned versions for posterity.

if (knitting) {
  krackhardt_egocentric %>%
    write_csv(here("data", "krackhardt", "clean_egocentric.csv"))
  
  krackhardt_allocentric %>%
    rename(choice = edge) %>%
    write_csv(here("data", "krackhardt", "clean_allocentric.csv"))
}

Session info

sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.5.2
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] broom.mixed_0.2.9.4 lmerTest_3.1-3      lme4_1.1-34        
##  [4] Matrix_1.6-1        patchwork_1.1.3     ggraph_2.1.0       
##  [7] tidygraph_1.2.3     here_1.0.1          lubridate_1.9.2    
## [10] forcats_1.0.0       stringr_1.5.0       dplyr_1.1.2        
## [13] purrr_1.0.2         readr_2.1.4         tidyr_1.3.0        
## [16] tibble_3.2.1        ggplot2_3.4.3       tidyverse_2.0.0    
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.0    viridisLite_0.4.2   farver_2.1.1       
##  [4] viridis_0.6.4       fastmap_1.1.1       tweenr_2.0.2       
##  [7] digest_0.6.33       timechange_0.2.0    lifecycle_1.0.3    
## [10] magrittr_2.0.3      compiler_4.3.1      rlang_1.1.1        
## [13] sass_0.4.7          tools_4.3.1         igraph_1.5.1       
## [16] utf8_1.2.3          yaml_2.3.7          knitr_1.43         
## [19] labeling_0.4.2      graphlayouts_1.0.0  bit_4.0.5          
## [22] abind_1.4-5         withr_2.5.0         numDeriv_2016.8-1.1
## [25] grid_4.3.1          polyclip_1.10-4     fansi_1.0.4        
## [28] colorspace_2.1-0    future_1.33.0       globals_0.16.2     
## [31] scales_1.2.1        MASS_7.3-60         cli_3.6.1          
## [34] crayon_1.5.2        rmarkdown_2.24      generics_0.1.3     
## [37] tzdb_0.4.0          minqa_1.2.5         cachem_1.0.8       
## [40] ggforce_0.4.1       splines_4.3.1       parallel_4.3.1     
## [43] vctrs_0.6.3         boot_1.3-28.1       jsonlite_1.8.7     
## [46] carData_3.0-5       car_3.1-2           hms_1.1.3          
## [49] bit64_4.0.5         ggrepel_0.9.3       listenv_0.9.0      
## [52] jquerylib_0.1.4     glue_1.6.2          parallelly_1.36.0  
## [55] nloptr_2.0.3        codetools_0.2-19    stringi_1.7.12     
## [58] gtable_0.3.3        munsell_0.5.0       pillar_1.9.0       
## [61] furrr_0.3.1         htmltools_0.5.6     R6_2.5.1           
## [64] rprojroot_2.0.3     vroom_1.6.3         evaluate_0.21      
## [67] lattice_0.21-8      highr_0.10          backports_1.4.1    
## [70] broom_1.0.5         bslib_0.5.1         Rcpp_1.0.11        
## [73] gridExtra_2.3       nlme_3.1-162        mgcv_1.8-42        
## [76] xfun_0.40           fs_1.6.3            pkgconfig_2.0.3
---
title: "Reanalyze Krackhardt 1987"
output:
  html_document:
    code_download: true
    code_folding: hide
    toc: true
    toc_float:
      collapsed: true
---

In his 1987 paper (Cognitive Social Structures), Krackhardt includes data from a 21-member network of managers (all male) at a tech manufacturing company. These data include not only egocentric reports (these are the people *I* go to for advice), but also allocentric reports (I think John often asks Bill for advice). We can try testing whether multistep relational abstraction explains the subjects' allocentric beliefs in this dataset.

# Setup

```{r setup}
# Set a random seed for reproducibility
set.seed(sum(utf8ToInt("krackhardt")))

library(tidyverse)
library(here)
library(tidygraph)
library(ggraph)
library(patchwork)
library(lme4)
library(lmerTest)
library(broom.mixed)

kable <- knitr::kable
vif <- car::vif

knitting <- knitr::is_html_output()

if (knitting) {
  # Create directories for saving stuff
  if (!dir.exists(here("outputs"))) {
    dir.create(here("outputs"))
  }
  
  if (!dir.exists(here("outputs", "07-krackhardt"))) {
    dir.create(here("outputs", "07-krackhardt"))
  }
}

# Pull in some modeling tools
source(here("code", "utils", "representation_utils.R"))
source(here("code", "utils", "network_utils.R"))

gg <- list(
  theme_bw(),
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines")
  )
)

plot_representation <- function(rep_df, relation_value, n_breaks = 3) {
  plot_df <- rep_df %>%
    filter(from != to) %>%
    group_by(from, to) %>%
    summarise(value = mean({{relation_value}}), .groups = "drop")
  
  plot_df %>%
    # Plot
    mutate(
      across(c(from, to), factor),
      from = fct_rev(from)
    ) %>%
    ggplot(aes(x=to, y=from, fill=value)) +
    gg +
    geom_tile() +
    coord_fixed() +
    scale_fill_viridis_c(
      option = "magma", name = NULL, na.value = "white", n.breaks = n_breaks
    ) +
    theme(
      legend.position = "bottom",
      axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)
    )
}

check_significance <- function(tidy_stats) {
  tidy_stats %>%
    mutate(
      significance = case_when(
        p.value < 0.001 ~ "***",
        p.value < 0.01 ~ "**",
        p.value < 0.05 ~ "*",
        p.value < 0.1 ~ ".",
        TRUE ~ ""
      )
    )
}
```

These data are included in Krackhardt 1987, Appendix A. Each allocentric CSV contains an adjacency matrix, so we'll need to do a little work to get them into a tidy-friendly format.

The "egocentric" network is defined in the way that social networks are most often defined: based on each subject's answer to the question, "Who are you connected to?" Oftentimes, this question is phrased to measure friendships (e.g., "Are you friends with X?"). In this particular dataset, the question is something closer to "Who do you ask for advice?"

The "allocentric" cognitive maps measure each subject's answer to the question, "Does X ask Y for advice?" This includes the special case of, "Does X ask you for advice?" Here's the rationale for including the special case in allocentric maps. Imagine for a moment that this were a study about friendships instead of advice-giving. In that case, we would consider "Are you friends with X?" to be a question that subjects would know the answer to (or at least, they would know better than anyone else). On the other hand, we would consider the question "Does X consider you a friend?" to be more of an inference than knowledge. I don't know whether X considers me a friend; I don't consider X to be a friend, but we do occasionally eat lunch together. Moreover, in cases of disagreement (X says he asks Y for advice; Y doesn't agree), the "ground truth" egocentric network would then hinge on the researcher's decision to use an intersection vs union rule to resolve disagreements; Krackhardt notes this in his paper. By alternatively defining the egocentric network based solely on the responses that subjects would have greatest introspective access to (their own advice-seeking behavior), we sidestep this researcher degree of freedom.

```{r clean-data}
krackhardt_raw <- here("data", "krackhardt") %>%
  fs::dir_ls(regexp = "krackhardt_tech_managers_sub[[:digit:]]{2}\\.csv") %>%
  map_dfr(
    .f = ~read_csv(.x, col_names = FALSE, show_col_types = FALSE),
    .id = "filename"
  ) %>%
  mutate(
    sub_id = str_extract(filename, "sub[[:digit:]]{2}"),
    sub_id = str_remove(sub_id, "sub"),
    sub_id = as.numeric(sub_id)
  ) %>%
  select(-filename) %>%
  group_by(sub_id) %>%
  mutate(from = row_number()) %>%
  ungroup() %>%
  pivot_longer(
    cols = -c(sub_id, from),
    names_to = "to",
    values_to = "edge"
  ) %>%
  mutate(
    to = str_remove(to, "X"),
    to = as.integer(to)
  )

krackhardt_egocentric <- krackhardt_raw %>%
  filter(sub_id == from, from != to) %>%
  select(-sub_id)

krackhardt_allocentric <- krackhardt_raw %>%
  filter(sub_id != from, from != to)
```


# Plot data

Okay, let's try and get a sense for what the "egocentric" network looks like.

```{r network-graph}
krackhardt_g <- krackhardt_egocentric %>%
  filter(edge == 1) %>%
  select(-edge) %>% 
  tbl_graph(edges = .) %>%
  mutate(name = row_number())

plot_network_graph <- krackhardt_g %>%
  mutate(Degree = centrality_degree(mode = "in")) %>%
  # Plot the "ground truth" network
  ggraph(layout = "stress") +
  geom_edge_link(
    aes(
      start_cap = label_rect(node1.name, padding = margin(15, 15, 15, 15)),
      end_cap = label_rect(node2.name, padding = margin(15, 15, 15, 15)),
    ),
    arrow = arrow(length = unit(5, "pt"), type = "closed")
  ) +
  geom_node_label(
    aes(label = name, size = Degree),
    fill = "grey95", show.legend = FALSE
  ) +
  scale_size(range = c(3, 10)) +
  theme(
    panel.background = element_blank(),
    strip.background = element_blank(),
    panel.border = element_blank(),
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    plot.title = element_text(hjust = 0.5, size = 13),
    strip.text = element_text(hjust = 0.5, size = 13),
    legend.position = "bottom"
  )

plot_network_graph
```

We can also plot this as a matrix instead of a graph, which will ease comparison with the cognitive strategies we'll consider next.

```{r network-adjacency}
plot_network_adj <- krackhardt_egocentric %>%
  plot_representation(edge, n_breaks = 2) +
  ggtitle("Egocentric Network")

plot_network_adj
```

How does this compare to subjects' mental representation of the network?

```{r plot-representation}
plot_choice <- krackhardt_allocentric %>%
  plot_representation(relation_value = edge) +
  ggtitle("Network Representation")

plot_network_adj | plot_choice
```

In the original paper, Krackhardt makes the following note: "There are some centers of focus for advice, notably 2 and 21 (both are vice presidents) ... In [the subjectively perceived network], 21 loses his prominence as a major recipient of nominations: instead, 18, 14, and 7 (the president) appear to be more central." To get a quantitative sense for this in our data, we can try calculating the "perceived" degree centrality by summing over the matrix columns, then comparing that against the "true" in-degree centrality.

```{r calc-centrality-diff}
centrality_differences <- krackhardt_g %>%
  mutate(
    Degree = centrality_degree(mode = "in"),
    Betweenness = centrality_betweenness(directed = TRUE)
  ) %>%
  as_tibble() %>%
  # Add represented in-degree; standardize sum by number of network members
  left_join(
    krackhardt_allocentric %>%
      group_by(name = to) %>%
      summarise(Perceived = sum(edge)/21, .groups = "drop"),
    by = c("name")
  ) %>%
  select(name, Perceived, everything()) %>%
  arrange(desc(Degree), desc(Perceived))
```

We can see that the same number of managers go to subjects 18 and 21 for advice (15 each), but that subject 18 is perceived as being much more "popular" (averaged across subjects, 11.95 managers are perceived to go to him for advice) than subject 21 (averaged across subjects, 8.38 managers are perceived to go to him for advice). A similar phenomenon happens with subjects 1 and 7, where subject 1 is perceived as being much less popular than subject 7, despite their having the same "true" popularity.

```{r centrality-diff-degree}
centrality_differences %>%
  ggplot(aes(x=Degree, y=Perceived)) +
  gg +
  geom_smooth(method = "lm", se = FALSE) +
  geom_label(aes(label = name)) +
  scale_x_continuous(name = "True (degree) centrality") +
  scale_y_continuous(name = "Perceived centrality")

centrality_differences %>%
  select(name, Degree, Perceived) %>%
  kable()
```

Is it possible that subjects' network representations might be reflecting some other centrality metric? In the original paper, Krackhardt notes that betweenness seems to have some predictive power. For the 18-21 and 1-7 test cases, we see that betweenness centrality does make the "correct" prediction about which network member is represented as being more important. But, we should note that betweenness doesn't do the greatest job predicting perceived popularity across the board; its greatest predictive power is in explaining the popularity discrepancy for the most popular managers.

```{r centrality-differences-between}
centrality_differences %>%
  pivot_longer(cols = Degree:Betweenness, names_to = "metric") %>%
  ggplot(aes(x=value, y=Perceived)) +
  gg +
  facet_wrap(~metric, scales = "free_x") +
  geom_smooth(method = "lm", se = FALSE) +
  geom_label(aes(label = name)) +
  scale_x_continuous(name = "True centrality") +
  scale_y_continuous(name = "Perceived centrality")

centrality_differences %>%
  kable()
```


# Simulated observations

We know that an individual's embedding within a network dictates what interactions and/or relationships they're able to observe. Therefore, when building predicted cognitive maps, we should use a different set of interactions to inform each subject's predicted representations. Of course, we have no idea what individual subjects' observations actually are. So we'll need to come up with a compact and reasonable set of assumptions about what observations were most likely available to individual network members.

Here are three key assumptions we'll make here:

1. We'll assume (trivially) that people can observe interactions between themselves and their immediate advisors (and/or advisees).
2. We'll also assume that people are generally able to observe interactions between advisors-of-advisors (and/or advisees-of-advisees), either through firsthand observation or secondhand chatter.
3. Though of course it's plausible that people might observe more distant interactions, we'll assume this happens with enough infrequency that we can ignore those interactions for the sake of modeling.

We've left open an ambiguity in those assumptions: are people observing advisors-of-advisors, and/or advisees-of-advisees? To aid understanding, we'll use the following example. Suppose we're looking at a subset of a network consisting of the grad student Abby, who asks the postdoc Betty for advice; in turn, Betty the postdoc asks Cathy the assistant professor for advice; and Cathy asks Dani the associate professor for advice. There are three reasonable ways that observations might work in this network:

1. Observations go "out": When Abby asks Betty for advice, Betty responds by sharing the wisdom she's gained from asking Cathy for advice. That leads Abby to observe that Betty goes to Cathy for advice.
2. Observations go "in": Betty has asked Cathy for advice. Cathy thinks Betty has asked a great question that she doesn't know the answer to, so Cathy asks Dani about it. That leads Dani to observe that Betty goes to Cathy for advice.
3. It goes both ways.

We may be able to test this empirically. We would need to simulate a set of observations consistent with all three possibilities, generate model predictions from those simulations, then compare how well those models fit managers' actual perceptions of the network.

```{r def-observations}
# Observations go both ways
obs_edgelist_all <- expand_grid(
  sub_id = 1:21,
  # Start with all one-step egocentric observations
  krackhardt_egocentric
) %>%
  # Convert adjlist into edgelist
  filter(edge == 1) %>%
  select(-edge) %>%
  # For a given node, we want to know who their "advisors-of-advisors" are
  # (or equivalently, "advisees-of-advisees")
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "all")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  # We assume that nodes are able to observe (directly or through chatter)
  # advice-giving interactions from their advisors and advisors-of-advisors
  # (or equivalently, advisees and advisees-of-advisees)
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  # Make pretty
  arrange(sub_id, from)

# Advisees observe who advises their advisors
obs_edgelist_out <- expand_grid(
  sub_id = 1:21,
  krackhardt_egocentric
) %>%
  filter(edge == 1) %>%
  select(-edge) %>%
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "out")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  arrange(sub_id, from)

# Advisors observe who their advisees advise
obs_edgelist_in <- expand_grid(
  sub_id = 1:21,
  krackhardt_egocentric
) %>%
  filter(edge == 1) %>%
  select(-edge) %>%
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "in")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  arrange(sub_id, from)

obs_adjlist_all <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_all %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))

obs_adjlist_out <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_out %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))

obs_adjlist_in <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_in %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))
```


# Schema-based representation

If people were using triad or community schemas/heuristics to inform representation, what would this look like? Using literally the exact same tools that we used to analyze the butterfly network, let's create some predicted representations.

```{r schema-experts}
schema_experts_all <- obs_adjlist_all %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_all %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_all %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )

schema_experts_out <- obs_adjlist_out %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_out %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_out %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )

schema_experts_in <- obs_adjlist_in %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_in %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_in %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )
```

```{r plot-schema-all}
plot_mem_all <- plot_representation(schema_experts_all, mem_value) +
  ggtitle("Memorization")

plot_triad_all <- plot_representation(schema_experts_all, triad_value) +
  ggtitle("Triad Completion")

plot_comm_all <- plot_representation(schema_experts_all, comm_value) +
  ggtitle("Community Detection")

plot_schemas_all <- (
  plot_choice | plot_mem_all | plot_triad_all | plot_comm_all
) +
  plot_annotation(
    title = "Observations of advice-giving go both ways",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_all

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_all.pdf"),
    plot_schemas_all,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

```{r plot-schema-out}
plot_mem_out <- plot_representation(schema_experts_out, mem_value) +
  ggtitle("Memorization")

plot_triad_out <- plot_representation(schema_experts_out, triad_value) +
  ggtitle("Triad Completion")

plot_comm_out <- plot_representation(schema_experts_out, comm_value) +
  ggtitle("Community Detection")

plot_schemas_out <- (
  plot_choice | plot_mem_out | plot_triad_out | plot_comm_out
) +
  plot_annotation(
    title = "Advisees observe advisors' advisors",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_out

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_out.pdf"),
    plot_schemas_out,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

```{r plot-schema-in}
plot_mem_in <- plot_representation(schema_experts_in, mem_value) +
  ggtitle("Memorization")

plot_triad_in <- plot_representation(schema_experts_in, triad_value) +
  ggtitle("Triad Completion")

plot_comm_in <- plot_representation(schema_experts_in, comm_value) +
  ggtitle("Community Detection")

plot_schemas_in <- (
  plot_choice | plot_mem_in | plot_triad_in | plot_comm_in
) +
  plot_annotation(
    title = "Advisors observe advisees' advisees",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_in

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_in.pdf"),
    plot_schemas_in,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

At the group-level... schemas make frankly terrible predictions, no matter what we assume about observations. Note that if a particular subject is predicted to ask all other managers for advice, the row associated with that subject will be filled with a value close to 1/20=0.05. So we can see that the triad completion strategy predicts that most managers are connected, and the community detection strategy predicts that literally all managers are connected. Neither comes even a little bit close to the actual allocentric network representation.


# Successor repesentation

Let's say people use multistep relational abstraction to represent networks, and let's use the SR to approximate that kind of abstraction. Using the observations we generated before, let's create lots of learning "trials" to train the algorithm.

```{r create-sr-obs}
sr_observations_all <- obs_edgelist_all %>%
  # Create simulated "learning trials" to train an asymptotic SR
  expand_grid(rep = 1:100) %>%
  # Randomize "order of trials" per repetition
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_observations_out <- obs_edgelist_out %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_observations_in <- obs_edgelist_in %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()
```

And now we can use those to build SRs at different scales. We'll assume that every subject is using the same successor horizon...

```{r compute-sr}
sr_all <- sr_observations_all %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )

sr_out <- sr_observations_out %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )

sr_in <- sr_observations_in %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )
```

Let's get a sense for the predictions when the SR is trained on observations that go both ways.

```{r plot-sr-all}
plot_sr1_all <- sr_all %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_all <- sr_all %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_all <- sr_all %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_all <- sr_all %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_all <- sr_all %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_all <- (
  plot_choice +
    plot_sr1_all + plot_sr3_all + plot_sr5_all + plot_sr7_all + plot_sr9_all
)

plot_sr_gallery_all

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_all.pdf"),
    plot_sr_gallery_all,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

And now where advisees observe their advisors' advisors...

```{r plot-sr-out}
plot_sr1_out <- sr_out %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_out <- sr_out %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_out <- sr_out %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_out <- sr_out %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_out <- sr_out %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_out <- (
  plot_choice +
    plot_sr1_out + plot_sr3_out + plot_sr5_out + plot_sr7_out + plot_sr9_out
)

plot_sr_gallery_out

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_out.pdf"),
    plot_sr_gallery_out,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

And finally where advisors observe their advisees' advisees...

```{r plot-sr-in}
plot_sr1_in <- sr_in %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_in <- sr_in %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_in <- sr_in %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_in <- sr_in %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_in <- sr_in %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_in <- (
  plot_choice +
    plot_sr1_in + plot_sr3_in + plot_sr5_in + plot_sr7_in + plot_sr9_in
)

plot_sr_gallery_in

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_in.pdf"),
    plot_sr_gallery_in,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```


# Statistical tests

Okay, now that we have all of the predicted strategies in hand, we'll want to perform some formal statistical tests to verify that the SR is truly doing the best at explaining subjects' network representations. Let's create a dataframe with all of the info we'll need.

```{r define-test-dataset}
test_dataset_all <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_all, by = c("sub_id", "from", "to")) %>%
  left_join(sr_all, by = c("sub_id", "from", "to"))

test_dataset_out <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_out, by = c("sub_id", "from", "to")) %>%
  left_join(sr_out, by = c("sub_id", "from", "to"))

test_dataset_in <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_in, by = c("sub_id", "from", "to")) %>%
  left_join(sr_in, by = c("sub_id", "from", "to"))
```

We'll do all of the statistical tests for the "observations go both ways" models. Note that we can't actually estimate a model for community detection, as this strategy predicts that everyone is connected to everyone.

```{r test-obs-all}
test_mem_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

# test_comm_all <- test_dataset_all %>%
#   glmer(
#     choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
#     family = "binomial", data = .
#   )

test_sr1_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )
```

Moving on to the "advisees observe advisors' advisors" models.

```{r test-obs-out}
test_mem_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

test_comm_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
    family = "binomial", data = .
  )

test_sr1_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )
```

Finally, the "advisors observe advisees' advisees" models. Once again, we can't estimate a model for community detection, as this strategy predicts that everyone is connected to everyone.

```{r test-obs-in}
test_mem_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

# test_comm_in <- test_dataset_in %>%
#   glmer(
#     choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
#     family = "binomial", data = .
#   )

test_sr1_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )
```

Of all of these models, which has the "best" fit, based on BIC?

```{r goodness-of-fit}
# Which has the best "goodness-of-fit"?
goodness_of_fit <- bind_rows(
  # All: goes both ways
  glance(test_mem_all) %>% mutate(model = "mem_all"),
  glance(test_triad_all) %>% mutate(model = "triad_all"),
  glance(test_sr1_all) %>% mutate(model = "sr1_all"),
  glance(test_sr2_all) %>% mutate(model = "sr2_all"),
  glance(test_sr3_all) %>% mutate(model = "sr3_all"),
  glance(test_sr4_all) %>% mutate(model = "sr4_all"),
  glance(test_sr5_all) %>% mutate(model = "sr5_all"),
  glance(test_sr6_all) %>% mutate(model = "sr6_all"),
  glance(test_sr7_all) %>% mutate(model = "sr7_all"),
  glance(test_sr8_all) %>% mutate(model = "sr8_all"),
  glance(test_sr9_all) %>% mutate(model = "sr9_all"),
  # Out: advisors of advisors
  glance(test_mem_out) %>% mutate(model = "mem_out"),
  glance(test_triad_out) %>% mutate(model = "triad_out"),
  glance(test_comm_out) %>% mutate(model = "comm_out"),
  glance(test_sr1_out) %>% mutate(model = "sr1_out"),
  glance(test_sr2_out) %>% mutate(model = "sr2_out"),
  glance(test_sr3_out) %>% mutate(model = "sr3_out"),
  glance(test_sr4_out) %>% mutate(model = "sr4_out"),
  glance(test_sr5_out) %>% mutate(model = "sr5_out"),
  glance(test_sr6_out) %>% mutate(model = "sr6_out"),
  glance(test_sr7_out) %>% mutate(model = "sr7_out"),
  glance(test_sr8_out) %>% mutate(model = "sr8_out"),
  glance(test_sr9_out) %>% mutate(model = "sr9_out"),
  # In: advisees of advisees
  glance(test_mem_in) %>% mutate(model = "mem_in"),
  glance(test_triad_in) %>% mutate(model = "triad_in"),
  glance(test_sr1_in) %>% mutate(model = "sr1_in"),
  glance(test_sr2_in) %>% mutate(model = "sr2_in"),
  glance(test_sr3_in) %>% mutate(model = "sr3_in"),
  glance(test_sr4_in) %>% mutate(model = "sr4_in"),
  glance(test_sr5_in) %>% mutate(model = "sr5_in"),
  glance(test_sr6_in) %>% mutate(model = "sr6_in"),
  glance(test_sr7_in) %>% mutate(model = "sr7_in"),
  glance(test_sr8_in) %>% mutate(model = "sr8_in"),
  glance(test_sr9_in) %>% mutate(model = "sr9_in"),
) %>%
  select(model, everything()) %>%
  arrange(BIC) %>%
  # Make this a little more human-readable
  separate(model, into = c("model", "direction")) %>%
  mutate(
    direction = case_when(
      direction == "out" ~ "advisee observes advisor of advisor",
      direction == "in" ~ "advisor observes advisee of advisee",
      direction == "all" ~ "both ways"
    )
  )

goodness_of_fit %>%
  mutate(
    model = fct_relevel(
      model,
      "sr1", "sr2", "sr3", "sr4", "sr5", "sr6", "sr7", "sr8", "sr9",
      "mem", "triad", "comm"
    )
  ) %>%
  ggplot(aes(x = model, y = BIC, color = direction)) +
  gg +
  geom_point() +
  ggtitle("Model goodness-of-fit") +
  theme(legend.position = "bottom")

goodness_of_fit %>%
  kable()
```

Let's take a look at the best-fitting model estimates:

```{r best-model}
test_sr7_out %>%
  tidy(conf.int = TRUE) %>%
  kable()

vif(test_sr7_out)
```



# SR centrality

Can the best-fitting SR explain an individual's perceptions of other network members' centrality?

```{r def-sr-centrality}
sr_centrality <- sr_out %>%
  # SR centrality from best variant
  filter(from != to) %>%
  group_by(sub_id, node_id = to) %>%
  summarise(sr_centrality = mean(sr7), .groups = "drop") %>%
  # Perceived centrality
  left_join(
    krackhardt_allocentric %>%
      group_by(sub_id, node_id = to) %>%
      summarise(perceived_centrality = mean(edge), .groups = "drop"),
    by = c("sub_id", "node_id")
  )
```

To answer this, we'll do a random-effects analysis where we take each individual's idiosyncratic mental representation of their network, then rank-correlate it with the SR's individual-level predictions.

```{r test-sr-centrality}
sr_centrality_corr <- sr_centrality %>%
  group_by(sub_id) %>%
  nest() %>%
  mutate(
    spearman = map_dbl(
      .x = data,
      .f = ~with(
        .x, cor(sr_centrality, perceived_centrality, method = "spearman")
      )
    )
  ) %>%
  ungroup() %>%
  select(-data)

sr_centrality_corr %>%
  mutate(spearman = tanh(spearman)) %>%
  select(spearman) %>%
  deframe() %>%
  t.test() %>%
  tidy() %>%
  kable(
    caption = str_c(
      "t-test of the average Spearman correlation between perceived and ",
      "SR-predicted centrality (z-transformed)"
    )
  )

sr_centrality_corr %>%
  summarise(mean_spearman = mean(spearman)) %>%
  kable()
```

Let's try to visualize these results in an intuitive format.

```{r plot-sr-centrality}
plot_sr_centrality_group <- sr_centrality %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Average across subjects to get group-level results
  group_by(node_id) %>%
  summarise(across(ends_with("_centrality"), mean)) %>%
  # For color-coding two pairs of managers with equivalent degree centrality,
  # but different perceived centrality
  mutate(
    colorcode = case_when(
      node_id %in% c(1, 7) ~ "a",
      node_id %in% c(18, 21) ~ "b"
    )
  ) %>%
  ggplot(aes(x=perceived_centrality, y=sr_centrality)) +
  gg +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  geom_label(aes(label = node_id, fill = colorcode), show.legend = FALSE) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_x_continuous(name = "Perceived centrality (rank-ordered)") +
  scale_y_continuous(name = "SR centrality (rank-ordered)") +
  ggtitle("Perceived vs Successor Centrality (Group-Level)")

plot_sr_centrality_group

plot_sr_centrality_individual <- sr_centrality %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id, nrow = 3) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Representation (Simulated)")
  ) +
  ggtitle("Perceived vs Successor Centrality (Individual-Level)") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

plot_sr_centrality_individual

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_individual_standard.pdf"),
    plot = plot_sr_centrality_individual,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

We're particularly interested in the manager pairs 1-7 and 18-21. On an individual-by-individual basis, does the model correctly predict, in each pair, which manager is represented as being more central? We'll do a prevalence test to see whether the model made correct predictions for significantly greater than 50% of subjects.

```{r prevalence-test}
sr_centrality %>%
  # Find and retain only relevant node-pairs
  mutate(
    pair_id = case_when(
      node_id %in% c(1, 7) ~ "1-7",
      node_id %in% c(18, 21) ~ "18-21"
    )
  ) %>%
  drop_na() %>%
  # Whichever manager is represented as being more central in each node-pair,
  # does the model make a correct prediction?
  group_by(sub_id, pair_id) %>%
  mutate(
    across(
      c(sr_centrality, perceived_centrality),
      ~.x - first(.x)
    )
  ) %>%
  filter(node_id %in% c(7, 21)) %>%
  summarise(
    correct = sr_centrality * perceived_centrality > 0,
    .groups = "drop"
  ) %>%
  # Tally correct predictions
  count(pair_id, correct) %>%
  # Prevalence test
  filter(correct) %>%
  split(.$pair_id) %>%
  map_dfr(
    .f = ~.x %>%
      select(n) %>%
      deframe() %>%
      binom.test(x = ., n = 21, p = 0.5, alternative = "greater") %>%
      tidy(),
    .id = "pair_id"
  ) %>%
  kable(caption = "Prevalence test (> 50%)")
```

We'd previously noted that betweenness seems to track discrepancies between a manager's true versus perceived centrality. Does SR centrality track with betweenness? The loess curve suggests the relationship is nonlinear, but there is a strongly significant rank correlation between the two.

```{r test-sr-betweenness}
sr_centrality %>%
  group_by(name = node_id) %>%
  summarise(sr_centrality = mean(sr_centrality)) %>%
  left_join(centrality_differences) %>%
  ggplot(aes(x=sr_centrality, y=Betweenness)) +
  gg +
  geom_smooth(method = "loess", se = FALSE) +
  geom_label(aes(label = name)) +
  xlab("Successor Representation \u03B3 = 0.7")

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "centrality_sr_vs_betweenness.pdf"),
    width = 6, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}

sr_centrality %>%
  group_by(name = node_id) %>%
  summarise(sr_centrality = mean(sr_centrality)) %>%
  left_join(centrality_differences) %>%
  with(cor.test(Betweenness, sr_centrality, method = "spearman", exact = FALSE)) %>%
  tidy() %>%
  kable(caption = "Betweenness vs SR gamma 0.7")
```


# Company leadership

So far, we have not provided the model with any privileged knowledge about the company's senior leadership hierarchy. Yet, using observations constrained by the network structure, the model has identified most of the company's leadership anyways.

The CEO is coded as node 7, and the four Vice Presidents are coded as 2, 14, 18, and 21. We see clear evidence that the model identifies nodes 2, 7, 18, and 21 as being central figures, but the model makes conspicuously incorrect predictions about manager 14.

Could the model in principle do a better job at identifying manager 14 as a central network member? To test this, we'll make one additional assumption about managers' observations: they observe the senior leadership twice as much as other network members.

```{r leadership-compute-sr}
sr_observations_leadership <- obs_edgelist_out %>%
  # Double-count CEO and VPs
  bind_rows(
    obs_edgelist_out %>%
      filter(from %in% c(2, 7, 14, 18, 21) | to %in% c(2, 7, 14, 18, 21))
  ) %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_leadership <- sr_observations_leadership %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )
```

We'll test whether the model goodness-of-fit is better at $\gamma = 0.7$, which was the best-fitting model from before:

```{r leadership-bic}
test_sr7_leadership <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_out, by = c("sub_id", "from", "to")) %>%
  left_join(sr_leadership, by = c("sub_id", "from", "to")) %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

bind_rows(
  glance(test_sr7_out) %>% mutate(model = "sr7 standard"),
  glance(test_sr7_leadership) %>% mutate(model = "sr7 leadership 2x")
) %>%
  kable()
```

```{r leadership-centrality}
sr_centrality_leadership <- sr_leadership %>%
  # SR centrality from best variant
  filter(from != to) %>%
  group_by(sub_id, node_id = to) %>%
  summarise(sr_centrality = mean(sr7), .groups = "drop") %>%
  # Perceived centrality
  left_join(
    krackhardt_allocentric %>%
      group_by(sub_id, node_id = to) %>%
      summarise(perceived_centrality = mean(edge), .groups = "drop"),
    by = c("sub_id", "node_id")
  )

sr_centrality_corr_leadership <- sr_centrality_leadership %>%
  group_by(sub_id) %>%
  nest() %>%
  mutate(
    spearman = map_dbl(
      .x = data,
      .f = ~with(
        .x, cor(
          sr_centrality, perceived_centrality, method = "spearman"
        )
      )
    )
  ) %>%
  ungroup() %>%
  select(-data)

sr_centrality_corr_leadership %>%
  mutate(spearman = tanh(spearman)) %>%
  select(spearman) %>%
  deframe() %>%
  t.test() %>%
  tidy() %>%
  kable(
    caption = str_c(
      "t-test of the average Spearman correlation between perceived and ",
      "SR-predicted centrality (z-transformed)"
    )
  )

sr_centrality_corr_leadership %>%
  summarise(mean_spearman = mean(spearman)) %>%
  kable()
```

```{r leadership-plot-centrality}
plot_sr_centrality_group_leadership <- sr_centrality_leadership %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Average across subjects to get group-level results
  group_by(node_id) %>%
  summarise(across(ends_with("_centrality"), mean)) %>%
  # For color-coding two pairs of managers with equivalent degree centrality,
  # but different perceived centrality
  mutate(
    colorcode = case_when(
      node_id %in% c(1, 7) ~ "a",
      node_id %in% c(18, 21) ~ "b"
    )
  ) %>%
  ggplot(aes(x=perceived_centrality, y=sr_centrality)) +
  gg +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  geom_label(aes(label = node_id, fill = colorcode), show.legend = FALSE) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_x_continuous(name = "Perceived centrality (rank-ordered)") +
  scale_y_continuous(name = "SR centrality (rank-ordered)") +
  ggtitle("Perceived vs Successor Centrality (Group-Level)")

plot_sr_centrality_group_leadership

plot_sr_centrality_individual_leadership <- sr_centrality_leadership %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id, nrow = 3) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Representation (Simulated)")
  ) +
  ggtitle("Perceived vs Successor Centrality (Individual-Level)") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

plot_sr_centrality_individual_leadership

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_group_leadership.pdf"),
    plot = plot_sr_centrality_group_leadership,
    width = 6, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
  
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_individual_leadership.pdf"),
    plot = plot_sr_centrality_individual_leadership,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

# Figures for paper

```{r paper-network-graph}
network_for_paper <- krackhardt_g %>%
  mutate(
    Degree = centrality_degree(mode = "in"),
    node_colorcode = case_when(
      name %in% c(1, 7) ~ "a",
      name %in% c(18, 21) ~ "b"
    )
  ) %>%
  # To clean up some of the overplotting
  activate("edges") %>%
  mutate(
    edge_colorcode = case_when(
      from %in% c(1, 7) | to %in% c(1, 7) ~ "a",
      from %in% c(18, 21) | to %in% c(18, 21) ~ "b"
    )
  ) %>%
  # Plot the "ground truth" network
  ggraph(layout = "stress") +
  geom_edge_link(
    aes(
      start_cap = label_rect(node1.name, padding = margin(15, 15, 15, 15)),
      end_cap = label_rect(node2.name, padding = margin(15, 15, 15, 15)),
      color = edge_colorcode
    ),
    arrow = arrow(length = unit(5, "pt"), type = "closed"),
    show.legend = FALSE
  ) +
  geom_node_label(
    aes(label = name, size = Degree, fill = node_colorcode),
    show.legend = FALSE
  ) +
  scale_size(range = c(3, 10)) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_edge_color_manual(
    values = c("#1f78b4", "#33a02c"), na.value = "grey80"
  ) +
  ggtitle("Advice Network (Krackhardt 1987)") +
  theme(
    panel.background = element_blank(),
    strip.background = element_blank(),
    panel.border = element_blank(),
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    plot.title = element_text(hjust = 0.5, size = 13),
    strip.text = element_text(hjust = 0.5, size = 13),
    legend.position = "bottom"
  )

network_for_paper

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "network_graph.pdf"),
    plot = network_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

```{r paper-representation}
representation_for_paper <- (
  (plot_choice + ggtitle("Perceived (Empirical)")) |
    plot_sr7_out + ggtitle("Successor Rep. \u03B3 = 0.7 (Simulated)")
) +
  plot_annotation(
    title = "Network Representation",
    theme = theme(plot.title = element_text(size = 13, hjust = 0.5))
  )

representation_for_paper

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "representation.pdf"),
    plot = representation_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

```{r paper-centrality}
centrality_individual_for_paper <- sr_centrality %>%
  filter(sub_id %in% c(17, 19)) %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Rep. (Simulated)")
  ) +
  ggtitle("Individual-Level") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

centrality_for_paper <- (
  (
    plot_sr_centrality_group +
      ggtitle("Group-Level") +
      scale_y_continuous(name = "Successor Rep. centrality (rank-ordered)")
  ) |
    centrality_individual_for_paper
) +
  plot_annotation(
    title = "Perceived vs Successor Centrality",
    theme = theme(plot.title = element_text(size = 13, hjust = 0.5))
  )

centrality_for_paper

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "centrality.pdf"),
    plot = centrality_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```


# Save data

It was a bit of a pain getting the raw data into shape, so we'll save the cleaned versions for posterity.

```{r save-data}
if (knitting) {
  krackhardt_egocentric %>%
    write_csv(here("data", "krackhardt", "clean_egocentric.csv"))
  
  krackhardt_allocentric %>%
    rename(choice = edge) %>%
    write_csv(here("data", "krackhardt", "clean_allocentric.csv"))
}
```


# Session info

```{r session-info}
sessionInfo()
```

